Rank | Count | Beginning |
---|---|---|
8134 | 6880 | В |
32792 | 2151 | И |
54879 | 1885 | Но |
3007 | 1863 | А |
48675 | 1773 | На |
96418 | 1554 | Это |
27836 | 1486 | Если |
98551 | 1154 | Я |
38105 | 1123 | Как |
66208 | 1062 | По |
47544 | 976 | Мы |
60440 | 939 | Он |
24768 | 933 | Для |
78585 | 901 | С |
52670 | 754 | Не |
72953 | 695 | При |
17068 | 688 | Все |
89757 | 654 | У |
69391 | 619 | После |
37713 | 577 | К |
85319 | 576 | Так |
59195 | 539 | Однако |
94438 | 499 | Что |
71265 | 465 | Поэтому |
61050 | 462 | Они |
40278 | 459 | Когда |
30350 | 405 | За |
60378 | 401 | Она |
20052 | 396 | Вы |
14757 | 368 | Вот |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
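The same count can be sketched outside the database for any N. The following Python snippet is a minimal illustration, not the query used to produce the table above. The function name `sentence_beginnings` and the sample sentences are hypothetical, and it assumes that words are separated by single spaces, as `substring_index(sentence, ' ', N)` does.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n words."""
    counts = Counter()
    for sentence in sentences:
        # Mirrors substring_index(sentence, ' ', n): keep the text before the
        # n-th space, or the whole sentence if it has fewer than n spaces.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # Sorted by descending count, like "order by cnt desc limit 50".
    return counts.most_common(limit)


# Hypothetical sample sentences, for illustration only.
sample = [
    "В лесу тихо.",
    "В городе шумно.",
    "И вот он пришёл.",
    "Но это не так.",
]

for beginning, count in sentence_beginnings(sample, n=1, limit=3):
    print(count, beginning)
```

Calling the function with `n=2`, `n=3` or `n=4` gives the counts for the longer beginnings covered in the following subsections.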
4.3.1.2 Most Frequent Sentence Beginnings II
4.3.1.3 Most Frequent Sentence Beginnings III
4.3.1.4 Most Frequent Sentence Beginnings IV
4.3.1.1 Most Frequent Sentence Endings I
4.3.1.2 Most Frequent Sentence Endings II
4.3.1.3 Most Frequent Sentence Endings III
4.3.1.4 Most Frequent Sentence Endings IV